Efficiently Mining Regional Outliers in Spatial Data

Authors

  • Richard Frank
  • Wen Jin
  • Martin Ester
Abstract

With the increasing availability of spatial data in many applications, spatial clustering and outlier detection have received a lot of attention in the database and data mining community. As a very prominent method, the spatial scan statistic finds a region that deviates (most) significantly from the entire dataset. In this paper, we introduce the novel problem of mining regional outliers in spatial data. A spatial regional outlier is a rectangular region that contains an outlying object such that the deviation between the non-spatial attribute value of this object and the aggregate value of this attribute over all objects in the region is maximized. Compared to the spatial scan statistic, which targets global outliers, our task aims at local spatial outliers. We introduce two greedy algorithms for mining regional outliers. Both grow a region by adding at least one neighboring object per iteration, choosing the extension that leads to the largest increase of the objective function. Our experimental evaluation on synthetic datasets and a real dataset demonstrates the meaningfulness of this new type of outlier and the greatly superior efficiency of the proposed algorithms.
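The greedy region-growing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the aggregate, the neighborhood, or the stopping rule, so this sketch assumes the mean as the aggregate, the absolute difference as the deviation, the `k` objects nearest to the current rectangle as the "neighboring" objects, and termination when no extension increases the objective. The names `grow_region`, `objective`, `inside`, `dist_to_rect`, and the parameter `k` are hypothetical.

```python
from math import hypot
from statistics import mean

# A point is (x, y, value); a rectangle is (x0, y0, x1, y1).

def inside(rect, p):
    x0, y0, x1, y1 = rect
    return x0 <= p[0] <= x1 and y0 <= p[1] <= y1

def dist_to_rect(rect, p):
    # Euclidean distance from a point to the rectangle (0 if inside).
    x0, y0, x1, y1 = rect
    dx = max(x0 - p[0], 0, p[0] - x1)
    dy = max(y0 - p[1], 0, p[1] - y1)
    return hypot(dx, dy)

def objective(seed, members):
    # ASSUMPTION: deviation = |seed value - mean value over the region|.
    return abs(seed[2] - mean(v for _, _, v in members))

def grow_region(seed, points, k=3):
    # Start with the degenerate rectangle that covers only the seed location.
    rect = (seed[0], seed[1], seed[0], seed[1])
    best = objective(seed, [p for p in points if inside(rect, p)])
    while True:
        # ASSUMPTION: "neighboring" objects are the k outside points
        # closest to the current rectangle.
        outside = [p for p in points if not inside(rect, p)]
        outside.sort(key=lambda p: dist_to_rect(rect, p))
        best_cand = None
        for p in outside[:k]:
            # Smallest rectangle covering the current one plus p, so each
            # extension adds at least one object (possibly more).
            cand = (min(rect[0], p[0]), min(rect[1], p[1]),
                    max(rect[2], p[0]), max(rect[3], p[1]))
            score = objective(seed, [q for q in points if inside(cand, q)])
            if score > best:
                best, best_cand = score, cand
        if best_cand is None:  # no extension improves the objective
            return rect, best
        rect = best_cand

points = [(0, 0, 10.0), (1, 0, 1.0), (0, 1, 1.0), (5, 5, 10.0)]
rect, score = grow_region(points[0], points)
print(rect, score)  # (0, 0, 1, 1) 6.0
```

In this toy dataset the region around the high-valued object at the origin grows to cover its two low-valued neighbors and stops before absorbing the distant high-valued object, since that would pull the region mean back toward the seed value.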

Similar resources

Efficient Voronoi K-Means Algorithm for Mining Local Crime Spatial Outliers in Spatial Crime Data

With the growing accessibility of spatial and temporal data in many research fields, spatial clustering and spatial outlier detection have received considerable attention in spatial data mining research. As a very well-known method, the CLIQUE Optimization finds a region that deviates significantly from the entire spatial data set. In this paper, we introduce the novel problem of mining cr...

Local multivariate outliers as geochemical anomaly halos indicators, a case study: Hamich area, Southern Khorasan, Iran

Anomaly recognition has always been a prominent subject in preliminary geochemical explorations. For regional geochemical data processing, there is a range of statistical and data mining techniques, as well as different mapping methods that serve to present the outputs. Outlier values are of interest in investigations where data are gathered under controlled condition...

On Detecting Spatial Outliers

The ever-increasing volume of spatial data has greatly challenged our ability to extract useful but implicit knowledge from them. As an important branch of spatial data mining, spatial outlier detection aims to discover the objects whose non-spatial attribute values are significantly different from the values of their spatial neighbors. These objects, called spatial outliers, may reveal importa...

Mining Outliers in Spatial Networks

Outlier analysis is an important task in data mining and has attracted much attention in both research and applications. Previous work on outlier detection involves different types of databases such as spatial databases, time series databases, biomedical databases, etc. However, few of the existing studies have considered spatial networks where points reside on every edge. In this paper, we stu...

Grid-ODF: Detecting Outliers Effectively and Efficiently in Large Multi-dimensional Databases

Outlier detection is an important task in data mining that enjoys a wide range of applications such as detection of credit card fraud, criminal activity and exceptional patterns in databases. In recent years, there has been much research work on outlier detection, and new notions such as distance-based outliers and density-based local outliers have been proposed. However, the existing ...

Publication date: 2007